fix(bedrock): wrap user text/image in guardContent to prevent tool result false positives by giulio-leone · Pull Request #1886 · strands-agents/sdk-python

giulio-leone · 2026-03-13T00:57:21Z

Summary

When Bedrock guardrails are enabled, tool results stored with role: "user" can trigger false-positive prompt injection detections. For example, a tool returning "You are Test Admin User." gets flagged as a prompt injection attack on subsequent messages.

Root Cause

Without guardrail_latest_message=True, no content blocks are wrapped in guardContent. The Bedrock guardrail then evaluates all message content — including tool results that happen to have role: "user" — leading to false positives on system-generated content.

Even with the existing _find_last_user_text_message_index fix (PR #1658), this only activates when guardrail_latest_message=True. The default behavior (False) leaves tool results exposed to guardrail scanning.

Fix

When guardrails are enabled (guardrail_id + guardrail_version are set), all user text and image content blocks are wrapped in guardContent. This signals the guardrail to evaluate only those blocks, excluding tool results (which contain toolResult blocks, not text/image) from scanning.

Behavior

`guardrail_latest_message`	Before	After
`True`	Only latest user text wrapped	Unchanged
`False` (default)	No wrapping — guardrail scans everything	All user text/image wrapped — tool results excluded

Tests

Added test_format_request_guardrail_default_wraps_all_user_text — verifies all user text is wrapped when guardrails enabled
Added test_format_request_guardrail_default_excludes_tool_results — reproduces the exact scenario from [BUG] Bedrock Guardrail False Positive on Tool Results #1671
Added test_format_request_no_guardrail_no_wrapping — verifies no wrapping without guardrails
Updated 2 existing config tests to reflect new wrapping behavior
All 128 bedrock tests pass

giulio-leone · 2026-03-16T05:31:33Z

Friendly ping — wraps user text/image content in guardContent format for Bedrock, preventing guardrail checks from treating tool results as user input.

…sult false positives When guardrails are enabled, tool results (role='user') containing text like 'You are Test Admin User.' can trigger false-positive prompt injection detections because the guardrail treats them as user input. The fix wraps ALL user text/image content blocks in guardContent when guardrails are enabled (not just when guardrail_latest_message=True). This signals the guardrail to evaluate ONLY those blocks, excluding tool results from scanning. Behavior change: - guardrail_latest_message=True: unchanged (only latest user text wrapped) - guardrail_latest_message=False (default): all user text/image wrapped, tool results excluded from guardrail scanning Closes strands-agents#1671

giulio-leone · 2026-03-23T06:10:05Z

Refreshed onto main @ fd8168a (v1.32.0+2) — 2026-03-23

Root cause confirmed still live: The guardrail wrapping logic only applies when guardrail_latest_message=True (opt-in). Without it, tool results (which also carry role="user") are sent to Bedrock guardrails unwrapped — causing false-positive prompt injection detections on system-generated content (see #1671).

Fix: Introduce has_guardrail pre-check (guardrail_id + guardrail_version both present). When guardrail_latest_message=False (the default), wrap all user text/image blocks in guardContent — but never wrap toolResult blocks. When guardrail_latest_message=True, preserve the existing "only latest user message" behavior.

Runtime proof on rebased branch ef07544:

role=user  block=text        guardrail_wrapped=True   ← user message correctly wrapped
role=user  block=toolResult  guardrail_wrapped=False  ← tool result correctly excluded

Test results:

25 targeted guardrail/guard-content tests: 25/25 PASSED
Full Bedrock model test suite: 126/126 PASSED

github-actions bot added the size/m label Mar 13, 2026

giulio-leone requested a deployment to manual-approval March 13, 2026 00:57 — with GitHub Actions Waiting

giulio-leone force-pushed the fix/guardrail-tool-result-false-positive branch from 9c5cdb1 to 6114956 Compare March 15, 2026 16:12

github-actions bot added size/m and removed size/m labels Mar 15, 2026

giulio-leone requested a deployment to manual-approval March 15, 2026 16:12 — with GitHub Actions Waiting

giulio-leone force-pushed the fix/guardrail-tool-result-false-positive branch from 6114956 to ef07544 Compare March 23, 2026 06:10

github-actions bot added size/m and removed size/m labels Mar 23, 2026

giulio-leone requested a deployment to manual-approval March 23, 2026 06:10 — with GitHub Actions Waiting

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(bedrock): wrap user text/image in guardContent to prevent tool result false positives#1886

fix(bedrock): wrap user text/image in guardContent to prevent tool result false positives#1886
giulio-leone wants to merge 1 commit intostrands-agents:mainfrom
giulio-leone:fix/guardrail-tool-result-false-positive

giulio-leone commented Mar 13, 2026

Uh oh!

giulio-leone commented Mar 16, 2026

Uh oh!

giulio-leone commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

giulio-leone commented Mar 13, 2026

Summary

Root Cause

Fix

Behavior

Tests

Uh oh!

giulio-leone commented Mar 16, 2026

Uh oh!

giulio-leone commented Mar 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant